Incorporating Knowledge of Source Language Text in a System for Dictation of Document Translations
Abstract
This paper describes methods for integrating source language and target language information for machine-aided human translation (MAHT) of text documents. These methods are applied to a language translation task in which a human translator dictates a first-draft translation of a source language document. A method is presented that integrates target language automatic speech recognition (ASR) models with source language statistical machine translation (SMT) and named entity recognition (NER) information at the phonetic level. Information extracted from a source language document, including translation model probabilities and translated named entities, is combined with acoustic-phonetic information obtained from phone lattices produced by the ASR system. Phone-level integration allows the combined MAHT system to correctly decode words that are either not in the ASR vocabulary or would have been incorrectly decoded by the ASR system. The combined MAHT system is shown to reduce the word error rate on the dictated translations by 32% relative to a stand-alone baseline ASR system.
Similar resources
Loss of the Socio-cultural Implicit Meanings in the English Translations of Mu’allaqat
Abstract Translation of literary texts, especially poetry, is one of the most difficult tasks; it requires mastery and knowledge of the language system and culture, and lack of this might lead to wrong translation. This study aimed to examine the loss and gain of the sociocultural implicit meanings in the English translations of the Mu’allaqat, and assess whether the translators of the Mu’allaq...
Integration of ASR and machine translation models in a document translation task
This paper is concerned with the problem of machine aided human language translation. It addresses a translation scenario where a human translator dictates the spoken language translation of a source language text into an automatic speech dictation system. The source language text in this scenario is also presented to a statistical machine translation system (SMT). The techniques presented in t...
Transmission of Ideology through Translation: A Critical Discourse Analysis of Chomsky’s “Media Control” and its Persian Translations
Among factors that might manipulate translators’ mind while producing a text is the notion of ideology transmission through text or talk. Adopting Critical Discourse Analysis (CDA) with particular emphasis on the framework of Van Dijk (1999), the present investigation is an attempt to shed light on the relationship between language and ideology involved in translation in general, and more speci...
Automatic text dictation in computer-assisted translation
In this paper, we study the incorporation of statistical machine translation models to automatic speech recognition models in the framework of computer-assisted translation. The system is given a source language text to be translated and it shows the source text to the human translator to translate it orally. The system captures the user speech which is the dictation of the target language sent...
Publication date: 2009